NUMBER 14 





LIGHT ON THE MYSTERY OF ADMISSIONS
Edward S. Noyes


A DEAN LOOKS AT ADMISSIONS
Frank R. Kille


MAKING TEST SCORES MEANINGFUL
William B. Schrader










Also in this issue


Pre-induction scholarships
Fall Board meeting
Age distribution of candidates
A suggestion to colleges
Regional conferences held in Evanston and Roanoke
Special committees make progress reports
Fees reduced for physically handicapped candidates
Northwestern accepts May 21 date
Board plans publication of two new manuals
Progress of General Composition Test
More candidates this year
Board publications
Dates, tests, fees: 1951














THE COLLEGE BOARD REVIEW 


News and Research of the 
College Entrance Examination Board 


Published three times a year by the 
College Entrance Examination Board 
425 West 117th Street, New York 27, N. Y. 


Director: Frank H. Bowles
Secretary: William C. Fels


Pre-induction scholarships 


In an experiment designed to extend the
values of an integrated liberal education, a
total of two hundred young men will be
admitted on scholarships to Chicago, Columbia,
Wisconsin, and Yale in the year 1951-52.
These scholarships, made possible by a
grant from the Ford Foundation, will be
offered to applicants of high academic
attainment who must have completed not less
than the 10th grade and be not more than
16½ years of age on September 15, 1951.

The scholarships will provide full tuition 
for two years and may, on grounds of finan- 
cial need, provide as much as $1,000 a year 
for expenses. 

Each applicant who has not already taken 
Board tests during the current year will be 
required to take the May series. 

The scholarships will permit a group of 
superior students to complete at least two 
years of college prior to their period of 
national service. 


Fall Board meeting


The Fall 1951 meeting of the College Entrance
Examination Board will be held at
the Hotel Biltmore in New York City on
October 31.


Age distribution of candidates 


Defense regulations may make the ages of 
college applicants of crucial importance in 
1951. 

For the information of admissions officers,
the age distribution of male Board candidates
between the ages of 16½ and 19½
(as of September 1, 1951) who took the
March, 1951 tests is tabulated below.

Of the male candidates who took the tests,
54.8% will be at least 18 years of age, but only
24.8% will be at least 18½ years of age, and only
9.2% will be at least 19 years of age.


Distribution of Birth Dates of Male Candidates 
Taking the CEEB Series of March, 1951 


Month Year Cum. Percent 
March 1932 04.80 
April 1932 06.00 
May 1932 07.20 
June 1932 08.00 
July 1932 08.80 
August 1932 09.20 
September 1932 11.00 
October 1932 12.40 
November 1932 14.20 
December 1932 16.80 
January 1933 21.00 
February 1933 24.80 
March 1933 30.00 
April 1933 34.60 
May 1933 40.40 
June 1933 43.60 
July 1933 50.20 
August 1933 54.80 
September 1933 60.80 
October 1933 65.20 
November 1933 70.00 
December 1933 74.80 
January 1934 79.20 
February 1934 82.00 
March 1934 84.80 
April 1934 87.40 
May 1934 89.40 
June 1934 91.60 
July 1934 93.20 
August 1934 94.20 
September 1934 95.40 
October 1934 96.80 
November 1934 97.20 
December 1934 98.00 
January 1935 98.20 
February 1935 98.60 
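
The percentages quoted in the text above the table can be read directly from the cumulative column. As a minimal sketch (the function names and the month-counting convention are assumptions made for illustration, not part of the Board's tabulation), the following Python code looks up the share of candidates who will have reached a given age by September 1, 1951. It assumes that a candidate is counted as having reached an age only if his whole birth month falls early enough.

```python
# Reference date used in the article: September 1, 1951.
REF_YEAR, REF_MONTH = 1951, 9

# Cumulative percentages copied from the table, in order,
# starting with March 1932 and ending with February 1935.
CUM_PERCENT = [
    4.80, 6.00, 7.20, 8.00, 8.80, 9.20, 11.00, 12.40, 14.20, 16.80,
    21.00, 24.80, 30.00, 34.60, 40.40, 43.60,
    50.20, 54.80, 60.80, 65.20, 70.00, 74.80,
    79.20, 82.00, 84.80, 87.40, 89.40, 91.60,
    93.20, 94.20, 95.40, 96.80, 97.20, 98.00,
    98.20, 98.60,
]


def birth_month(index):
    """Return (year, month) for the table row at the given position."""
    months_since_jan_1932 = index + 2  # row 0 is March 1932
    return 1932 + months_since_jan_1932 // 12, months_since_jan_1932 % 12 + 1


def completed_months(year, month):
    """Months of age certainly completed by the reference date
    for anyone born in the given month (assumed convention)."""
    return (REF_YEAR - year) * 12 + (REF_MONTH - month) - 1


def percent_at_least(age_in_months):
    """Cumulative percent of candidates who have reached the given age."""
    reached = [
        pct for i, pct in enumerate(CUM_PERCENT)
        if completed_months(*birth_month(i)) >= age_in_months
    ]
    return max(reached, default=0.0)


for label, months in [("18", 216), ("18 1/2", 222), ("19", 228)]:
    print(label, percent_at_least(months))
```

Because the column is cumulative from the oldest candidates downward, the figure for any age is simply the entry for the latest birth month that still qualifies: August 1933 for age 18, February 1933 for age 18½, and August 1932 for age 19.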


[ 194 ] 













Light on the Mystery of Admissions


Edward S. Noyes 


Edward S. Noyes, a past Chairman of the Col- 
lege Entrance Examination Board, is Associate 
Professor of English and Chairman of the Board 
of Admissions at Yale University. 

“Light on the Mystery of Admissions” was
originally delivered by Professor Noyes as an
address under the title of “Academic and
Non-academic Appraisal of Candidates.” It is
published in somewhat shortened form by the
Review in response to persistent requests that it
be made available in print.


A lighted candle might well symbolize that 
flickering gleam which is all the illumination I 
can hope to cast upon the dark mystery of my 
subject. The appraisal of candidates by colleges 
I call a dark mystery with deliberate intent, al- 
though the phrase needs qualification. To those 
who are admitted, there is no mystery. Quite 
confident that they were selected because they 
were the cream of the crop, they go on their 
way rejoicing. To any admissions officer, more- 
over, the appraisal of his own college’s appli- 
cants is not a dark mystery; he knows why 
many were called but few were chosen, although 
he may be unable or unwilling to explain the 
reasons behind the process. To one admissions 
officer, however, the appraisal of candidates by 
any other admissions officer may seem odd, if 
not precisely mysterious. These fortunate 
groups are exceptions. To all rejected candi- 
dates; to their parents, their sisters, their 
cousins, and their aunts; to the schools from 
which they came; to the public at large and to 
any alumnus after an unsuccessful football sea- 
son—to all these, how a college appraises its 
applicants is indeed a mystery, black as the 
night from pole to pole. For the wise admissions 
officer is he who emulates two of Uncle Remus’s
famous characters: “Tarbaby ain’t sayin’
nothin’, and Brer Fox, he lay low.”

You perceive that I am not a wise admissions 
officer, for I can’t “lay low,” and I have to say 
something. For your indulgence I can only ap- 
peal. I shall have to talk in generalities, though 
we all know that we have to appraise real indi- 
viduals, living boys and girls, not merely gen- 
eralizations, or paper records and test scores. 
Furthermore, admissions problems and proce- 
dures will vary from college to college. All my 
experience has been at one university; I should 
warn you at the start that no other admissions 
officer or committee is to be held, a priori, guilty 
of the theories or practices I am about to confess. 


ACADEMIC APPRAISAL 


Since the program divides my subject neatly 
into the academic and non-academic appraisal 
of candidates, I choose to take these sections in 
that order and in the present tense. Into the in- 
volved history of efforts to estimate the aca- 
demic fitness of candidates for college I do not 
propose to enter. At present, many colleges, in- 
cluding my own, have devised schemes for pre- 
dicting success or failure in college studies for 
their applicants; I suspect this is the chief type 
of appraisal. Such methods differ in nomencla- 
ture and in details from college to college, but 
they are based upon two main assumptions. The 
first is that a student will continue to produce, in 
college, at about the same rate as he did in 
school. Like all generalizations, this has notable 
exceptions, but it seems to hold true for the vast 
majority of students. The second assumption is 
a logical corollary: If we can measure, in our 
college terms, the student’s performance at 
school and on entrance tests, we can predict his 
scholastic attainment in freshman year. Such 
measurement requires the study of the record, 


[ 195 ] 





first at school, and then in college, of every stu- 
dent who has entered from a given school, lead- 
ing to a correlation between the two sets of 
records made by the same students in the two 
institutions. For me, this is the real mystery. I 
am too poor a mathematician to comprehend 
the techniques involved. I do know that these 
correlations need frequent revision in the light 
of the performance of the latest delegations from 
the school; that similar work must be done with 
test scores; and that the relative weight put on 
school record and test record will vary according 
to our experience with the school. Though I 
went through a period of doubt, I have become 
convinced that the prediction, summing up in a 
single figure our whole experience with a school 
and with the tests, is probably the most effective 
means of academic appraisal. At most of the col- 
leges which use it, freshman mortality, or drop- 
ping for low stand, has steadily decreased. 
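
For readers who, like the speaker, find the correlation technique mysterious, the idea can be sketched in a few lines. The following Python code is only an illustration under stated assumptions: the school and freshman averages are invented figures, the function names are hypothetical, and a single straight-line fit on the school record stands in for whatever formulas a given college actually uses (which, as noted above, also weigh test scores). It computes the correlation between school averages and freshman averages for past entrants from one school, then uses the fitted line to predict a new applicant's freshman average.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Invented records for six past entrants from one hypothetical school:
# each student's school average and his later freshman average.
school_avgs = [78, 82, 85, 88, 91, 94]
freshman_avgs = [70, 74, 75, 80, 82, 86]

r = pearson(school_avgs, freshman_avgs)
slope, intercept = fit_line(school_avgs, freshman_avgs)


def predict(school_avg):
    """Predicted freshman average for an applicant from this school."""
    return slope * school_avg + intercept


print(f"correlation: {r:.2f}")
print(f"prediction for a school average of 90: {predict(90):.1f}")
```

Refitting the line each year with the latest delegation's records is one simple way to picture the "frequent revision" the correlations require.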


FAILURE PREDICTIONS 





Chart showing approximate ranking of 2,700 candidates 
for admission, on basis of predicted freshman averages. 


This chart is meant to show roughly how a
total group of nearly 3,000 applicants looks,
arranged in the order of their predictions for
freshman averages. It is not drawn to scale. At the
top, less than one per cent will have predictions
for averages of 90 or better: highest freshman
honors. At each level below, down to 70, there
are more and more candidates who, as students, 
look precisely alike. We know that some will ex- 
ceed and others fall below their predictions, but 
we cannot identify in advance these eccentrics. 
You note that the tapering off is sharp below
the 60 level. This is a fairly recent phenomenon,
due, I feel sure, to the spread of guidance work 
in the schools. In the old days there were plenty 
of applicants who, by any method of appraisal, 
were not good prospects as college students. 
Now, each year, we find fewer and fewer with 
predictions of failure because the guidance of- 
ficers have not encouraged such candidates to 
apply; and the final selection of the class be- 
comes more and more difficult. For, after the 
probable failures have been eliminated, there 
are many more reasonably competent students 
than places for entering freshmen, and of these, 
large groups will be practically undistinguish- 
able. If we relied solely on predictions, and 
simply rolled down this figure, stopping when 
we had enough to form the class, we should find, 
at any point including enough for that class, on 
the lowest levels, more than enough to overflow 
it—all with predictions so much alike that no 
just differentiation is possible. If for no other 
reason, therefore, it is essential to have another 
kind of appraisal: to try to find out what kinds 
of persons we have in these reasonably com- 
petent students. 

Before I go on to the more difficult section of 
this paper, however, I should like to say one 
thing about the academic appraisal. In a predic- 
tion, test scores are weighted as a whole, and all 
five scores on the College Board battery gener- 
ally have less influence on the final prediction 
than the school record. This means that a candidate
will not suffer unduly if one of his test
scores is low, provided that his school record is
good and his other tests satisfactory.

If I were holding a candle, its tiny beam would 
at this point flicker ominously, making the dark- 
ness of our mystery more apparent. I do not 
know of any reliable method of appraising char- 
acter, personality, and promise of future accom- 
plishment which enables one to make meaningful 
comparisons among hundreds of candidates. Of 
two boys recommended as good students by dif- 
ferent schools, we can feel fairly confident that 
the one with a 90 prediction is a better student 
than the one with a 75. But of two candidates 


[ 196 ] 











from different schools, each of whom is warmly 
recommended as an individual by his principal, 
we cannot tell which is the better person. We 
have moved from measurements largely objec- 
tive and quantitative to evaluations which are 
almost entirely subjective. Consider very briefly 
the sources of information for any non-academic 
appraisal. 


NON-ACADEMIC APPRAISAL 


By all odds the most important is the con- 
fidential report from the school. A short char- 
acter sketch, describing salient traits of the 
applicant, good or bad, and his main accom- 
plishments outside the classroom, if any, plus 
an informed guess about his capacity for de- 
velopment, is, when written by an experienced 
and interested person, the most valuable docu- 
ment in a candidate’s credentials. At the other 
extreme, in my own view, is the rating scale 
which dissects the individual into a number of 
qualities which, according to a consensus of his 
teachers’ opinions, he possesses in this or that 
degree. From a few scattered bones the pa- 
laeontologist can completely reconstruct a pre- 
historic animal. From an inventory of abstract 
qualities I find it more difficult to reconstruct 
the living boy. “Four out of five teachers think 
he is above average in initiative.” I don’t know 
what they mean by initiative, which may or 
may not be a trait desirable on the campus, de- 
pending upon how it is exhibited. Nor do I know 
how much of this quality the “average” boy is 
supposed to possess. You will understand, I am 
sure, that I am venting my own feelings about 
rating scales. Others may like them; they may 
be able to perform that feat of reconstruction 
which is so mysterious to me. 

The school statement is often illuminating 
about a particular candidate but, for obvious 
reasons, seldom can offer sound bases for com- 
parative appraisals. From it, however, colleges 
can usually detect those who are undesirable 
because of serious character defects, and most 
of those who are really outstanding as persons.
Almost any kind of appraisal will pick out those
two groups. Where we need more light is on the 
great mass: those who are neither undesirable 
nor genuinely outstanding. 

The next common source of information is the 
interview, whether handled by the admissions 
officer himself, or by some other representative 
of the college. Like the principal’s recommenda- 
tion, interviews are valuable in direct ratio to 
the intelligence and experience of those who con- 
duct and describe them. I know that some 
people do have an uncanny flair for sizing up 
candidates. The trouble is that every inter- 
viewer believes he possesses that faculty—and 
not all of us are right. He who, after twenty 
minutes of talk, feels that he has plumbed the 
depths of any candidate must be a hardy soul. 
Most interview reports, therefore, and I par- 
ticularly include my own, should be read with 
an attitude of enlightened skepticism. When 
they corroborate the information obtained from 
the school report, they are probably reliable; 
when they do not, a danger signal can be 
hoisted. They have this advantage: that one in- 
terviewer who meets a great many candidates 
from widely differing schools has a chance to 
make comparisons among those he has inter- 
viewed—always assuming that he can maintain 
his own standards! 


ADDITIONAL CRITERIA 


It is from these two sources, chiefly, that col- 
leges have to estimate their candidates as to 
soundness of character, degree of intellectual 
curiosity, ability to secure the respect of their 
contemporaries. Here I find myself forced to 
use a word I hoped to shun: leadership, about 
which more bunk has been written than about 
almost any word I can think of. I suspect that 
evaluation of leadership qualities is most often 
made by reviewing the candidate’s record in so- 
called activities. To what clubs did he belong; 
on what teams did he play; what did he do as 
editor, debater, actor, musician; what class of- 
fices did he hold? Do not misunderstand me; 


[ 197 ] 





I think the answers to these questions are in- 
teresting and valuable in our appraisal, but 
again I plead for an enlightened skepticism. 
Consider the class composed exclusively of lead- 
ers. Consider the effect on the big frog in a little 
puddle of being plunged into Lake Superior. 
And if one rates highly participation in activities,
what will one do with the lad who, because
of financial need, has had jobs on afternoons and
Saturdays, with no time for other extracurricular
pursuits? How can one appraise the boy who
likes to hunt, or fish, or even read, and who has 
had enough independence not to become en- 
tangled with competitive activities of the stand- 
ard kind? Sufficiently well-known types are, on 
the one hand, the big activities man who never 
amounted to much after graduation, and, on the 
other hand, the chap who in school and college
alike was inactive and inconspicuous, but who
blossomed out into an impressive and influential 
personality in later life. 

Each college hopes that it is training future 
leaders. It is at least arguable that the boy who 
is a keen and successful participant in some 
worthwhile activity may be a better bet for fu- 
ture leadership than the one who, on the same 
academic level, has shown no outside interest. 
Colleges are therefore eager to know about such 
participation, but the tail should not be allowed 
to wag the dog, and it would seem that an effort 
to appraise genuine potential leadership through 
prominence in extracurricular activities in sec- 
ondary school has not always been fruitful. 

That candle is almost out. I use its last gleam 
to put together what I have, so far, kept sep- 


(Continued on page 211) 


A Dean Looks at Admissions 
Frank R. Kille 


Frank R. Kille is Dean of Carleton College and
a member of the Executive Committee of the
College Entrance Examination Board.


The basic problems of college admissions are 
few, and they have changed but little through 
the years. One may assume that every college 
desires to select students who have the interest 
and capacity to do the work and live the life 
characteristic of its campus with satisfaction 
and maximum profit. To this end, admissions 
committees collect and weigh the applicant’s 
academic record, the evidence of his interest in 
and ability to do college work, the recommenda- 
tions on his character and personality, and the 
record of his extracurricular participation. 

But the decisions on admissions which each 
generation of educators faces are further com- 
plicated by the current economic conditions, the 
prevailing educational philosophies, and the
various educational services which are available.
Furthermore, each campus has its own peculiar 
traditions, convictions, and objectives which 
introduce still other variable factors. As a result, 
the details of admissions procedures and policies 
present a bewildering variety. 

Our topic is fortunately not so complex since 
our attention is focused on tests. However, even 
though our subject is limited, it would be pre- 
sumptuous of me to attempt to present the col- 
lege point of view. I shall be content to present 
a college point of view. In doing so, I will draw 
heavily on the only program I know in detail— 
the one in my own college. With the exception 
that every member of the entering class provides 
us with scores on both the aptitude and the 
achievement tests of the College Board, it is, for 
the most part, typical of a small-college pro- 
gram. It is not my intention to present a defense 
of this particular program but simply to use it in 


[ 198 ] 













order to touch on the main problems which are 
involved in college admissions. 

The best evidence that the student not only 
has ability to do college work but will apply 
himself once he is there can be found in the sec- 
ondary school record and in the recommenda- 
tion of his principal or adviser. Few would care 
to quarrel with that assumption. However, the 
situation which existed several years ago was in 
our opinion extremely difficult to handle with 
these data alone. 

Applicants with good records and good rec- 
ommendations exceeded the number of places 
available in the colleges of their choice. This 
presented the college with a serious public rela- 
tions problem. In former years if a college was 
forced to refuse admission to an able student, he 
could usually find a place elsewhere with ease. 
But that was not possible four years ago when 
enrollments were at their peak. Decisions made 
under those conditions worked a hardship on a 
few students, but I think we made fewer mis- 
takes by having all the information we could get 
on a candidate, particularly the Board scores. 

Time and again, a student with low class 
ranking from a small secondary school (or from 
a school with which we were not familiar) was 
able to substantiate the principal’s claim that 
the student deserved special consideration by 
presenting aptitude scores which compared 
favorably with those of students already admitted
from larger, well-known schools. He
might have stood low in class rank in a small 
class simply because his classmates happened to 
be unusually good. 

In contrast to such cases, very low test scores 
of other students often warned us that, although 
their grade records were good, they might run 
into trouble academically. We have had in- 
stances in which honor graduates of certain 
small schools made a very poor showing on the 
Board’s tests. Sometimes an explanation was 
found in unusual circumstances over which the 
student had no control. But in other cases, the 
high grades made in secondary school were in 
part the result of a lack of competition. The
principal may have known that, but in a small
community there is a limit to what he can say 
about an honor graduate of the local school. 
With the warning of the low Board scores which 
compared this student with thousands of other 
young people over the entire U.S. who hoped to 
enter college that year, we were able to give the 
student special counseling and adjust the aca- 
demic load at the very beginning of his college 
course. And if the scores were very low, as they 
have been on one or two occasions, the student 
was admitted on condition. The candidate did 
not always accept this offer, but in the long run 
the loss of an applicant is better than the failure 
of a freshman whose secondary school competi- 
tion gave him the false impression that he was 
doing exceptional work. 


APTITUDE SCORES USEFUL 


Another interesting thing is that, quite fre- 
quently among the men, we found Board apti- 
tude scores which compared favorably with 
those of other men whose school grades were 
better and which were often equal to those of 
girls whose grades were much better. In some 
of these instances, the principal or adviser knew 
the candidate well enough to say that he had not 
worked in his first three years, but now in his 
senior year he seemed to be under way. The 
scores substantiated the recommendation of the 
principal, and even though final course grades 
were not yet available, the boy could be con- 
sidered a good risk. 

Of course, some secondary schools can furnish 
aptitude test data. But the college problem is to 
handle an incoming class composed of students 
from all kinds of preparatory and public high 
schools, from schools of all sizes and in most of 
the states of the union. The committee gets 
hopelessly bogged down if it tries to compare the 
scores from all the different state and institu- 
tional testing programs, and also tries to make 
allowances for tests taken at different times and 
under different conditions. 

Like other colleges, we give a number of tests 


[ 199 ] 





during freshman orientation week but we do not 
want to wait until that date to plan our instruc- 
tion for the year or begin counseling and regis- 
tering the freshmen in courses. No doubt it could 
all be done in the fall, but the facilities of most 
small colleges would certainly be strained and I 
would think the initial enthusiasm for any such 
program of testing would rapidly diminish. 


BETTER ARTICULATION NEEDED 


In addition to advance warning on unusually 
poor students, we get advance warning also on 
unusually good ones. I suppose one of the great 
losses in higher education is the rather general 
neglect of this good group. We discuss the ar- 
ticulation of secondary school work with that of 
college, but we are frequently most concerned 
about the average student and give most of our 
time to the poorest ones. What about the able 
and seriously interested student who is well pre- 
pared to enter sophomore courses and yet is held 
in a framework of rigid requirements and must 
repeat work with which he is already familiar? 
Intellectually it is a demoralizing experience. 
Some colleges which are part of large universi- 
ties have at their disposal well-developed test- 
ing bureaus, but most small colleges do not have 
the facilities to handle the peak load of an enter- 
ing class. Nor have the general faculty much 
interest in pushing such a program if it has to be 
added to their regular assignments. 

Fortunately, our curriculum committee recog- 
nized years ago that the freshman has had an in- 
tellectual life before he comes to college and that 
if he is a student of unusual ability and interest, 
he should be given special academic privileges. 
To evaluate the course work in a multitude of 
secondary schools from the grade records alone 
was impossible. The Board scores on achieve- 
ment tests offered us just what we wanted—a 
standard test, given under uniform conditions, 
available any place in the country, protected by 
careful proctoring, and established by a policy- 
making body in which each participating college 
has a voice. 


Our first experience was with the foreign lan- 
guages. During one hour in the final week of the 
sophomore course which satisfies the college re- 
quirement in a foreign language, we used the 
Board’s reading test. After the instructor had 
made out his grades for the year, he was shown 
the scores made by his students on the Board 
test and the correlation of college grades and 
Board scores for the department as a whole. 
After a careful study, the departments agreed 
that we could use a Board score of a certain level 
just as well as the instructor’s grade when de- 
ciding whether a student had met the college 
graduation requirement of one foreign language. 
Of course, students always had had the privilege 
of asking for a departmental examination to 
gain exemption, but few took this initiative. In 
fact there was some feeling among faculty as 
well as students that the degree of difficulty of 
such an examination would vary greatly and 
would depend too much upon the particular in- 
structor giving the test. What we needed was an 
outside examining agency operated by a dis- 
interested party, in which students and all mem- 
bers of the faculty would have confidence, and 
we found that in the Board. 


PLACEMENT IN FRESHMAN ENGLISH 


Freshman English is another subject which is 
treated in a similar manner. High Board scores 
in verbal aptitude and in the achievement test 
in English composition, coupled with a good sec- 
ondary school record, qualify a student to go 
directly into the sophomore literature course if 
he wishes to do so. If he does not do satisfactory 
work, he is moved back into the freshman 
course. Most of the students do satisfactory 
work. The student who, under these conditions, 
completes the sophomore literature course suc- 
cessfully is considered to have fulfilled the two- 
year college requirement in English. 

This past fall, the faculty took a similar step 
in the natural sciences. All students who are 
graduated by Carleton College are required to
pass two year-courses in the natural sciences:


[ 200 ] 











one in the biological sciences and one in the 
physical sciences. After a careful study of high 
scores on a Board test in one of the sciences and 
of grades achieved by these students in college 
science courses, the science department recom- 
mended that high scores on a Board achieve- 
ment test in a natural science be accepted in lieu 
of one of the two college science requirements. 

In none of these instances have we given col- 
lege credit. However, the student gains elective 
subjects which in the case of an unusually well- 
prepared student may amount to four-fifths of 
a year’s work. I would not be surprised to see 
college faculties move in the direction of grant- 
ing a limited amount of college credit, once they 
are familiar with the Board’s examinations and 
have clearly defined their graduation require- 
ments. Should the Board develop more thorough 
achievement tests which would be given at the 
end of the academic year, colleges would be 
more likely to give serious consideration to 
granting of college credit to those who achieve 
high scores, particularly in skills such as the use 
of mathematics, English, and foreign languages. 

These trends toward the increased use of Col- 
lege Entrance Board tests in academic counsel- 
ing, in placement in sections of certain courses 
(a point which I have not developed at all), and 
in granting exemptions from certain college re- 
quirements have developed in response to a real 
academic need. The current national situation 
will encourage us all to give even more serious 
attention to these adjustments. We used to 
worry about securing a faculty and equipment; 
now we must add another worry—that of having 
time enough to do a quality job at the college 
level. If for many years to come our young men 
and perhaps our young women are to give two 
years of their lives to military service, the Board 
could perform no greater service than to make 
available tests of such quality that colleges 
would use them to grant some college credit to 
the unusually able and ambitious student. 

In closing, just to be sure I am not misunderstood,
I would like to restate the last suggestion
in the form of four propositions:


1. Students of unusual ability and interest 
can do some work of college calibre in their 
secondary schools. 


2. Evidence that they have done so can best 
be demonstrated by achievement tests 
taken at the end of their courses—tests 
which could also be given occasionally by 
the colleges in order that each college 
might establish a standard of its own. 


3. Whether credit hours are to be given for 
high scores in these tests or just exemp- 
tion from certain course requirements is 
a matter to be decided by each college. In 
either case the student would benefit, and 
the secondary school doing quality work 
would be encouraged. 


4. Achievement tests which would be satis- 
factory to educators in both secondary 
schools and colleges must meet certain 
criteria, among which are the following: 


a. The tests should be offered to secondary
school students nationally and, if possible,
internationally.

b. The tests must be administered under
strict provisions to guarantee full protection
of the contents. To be entirely
free from criticism, the administration of
the tests should be supervised by an
agency responsive to, but independent
of, the schools and colleges.

c. The tests must be developed by experts 
in test construction, but the experts 
must be assisted by the best thought of 
instructors in both secondary school and 
college and by adequate funds for re- 
search. This is essential not only to 
maintain the quality of the tests but to 
develop joint confidence in them. 


In view of its long experience with the prob- 
lems of developing and administering tests on a 
national scale and in view of the fact that it is 
a voluntary organization of colleges with secondary
school representation, the College Board
is one agency to which some of us would turn.


[ 201 ] 





Making Test Scores Meaningful 


William B. Schrader 


William B. Schrader is Head of the Validity
Studies Section of the Statistical Analysis
Department of the Educational Testing Service.


By April 9, 1951, score reports on the March 
Board series for more than 33,000 boys and girls 
from all over the world had been sent to hun- 
dreds of colleges and universities. For each test, 
the average candidate earned a score not much 
different from 500, and about two-thirds of the 
candidates earned scores between 400 and 600. 
As the table on the back of each score report in- 
dicated, each standard rating* between 200 and 
800 corresponded to a specific percentage of can- 
didates surpassed. Table 1, which applies to all 
College Board tests, shows the “per cent of can- 
didates receiving a lower rating” for various 
standard ratings. 

TABLE 1 


College Board Standard Ratings as an Index 
of Relative Excellence of Performance 


Standard        Percent of Candidates 
Rating          Receiving a Lower Rating 

800                     100 
750                      99 
700                      98 
650                      93 
600                      84 
550                      69 
500                      50 
450                      31 
400                      16 
350                       7 
300                       2 
250                       1 
200                       0 
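
The percentages in Table 1 are what a normal distribution of ratings with a mean of 500 and a standard deviation of 100 would give. The article does not state that assumption for this table, so the sketch below is only a check made under it; the function name is illustrative.

```python
import math

def percent_below(rating, mean=500.0, sd=100.0):
    """Percent of a normally distributed group scoring below `rating`.

    Assumption (not stated in the article): standard ratings follow a
    normal distribution with mean 500 and standard deviation 100.
    """
    z = (rating - mean) / sd
    # Normal cumulative distribution written with the error function.
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Print one line per rating listed in Table 1, from 800 down to 200.
for rating in range(800, 150, -50):
    print(rating, round(percent_below(rating)))
```

Under this assumption a difference of 50 points matters more near the middle of the scale, where candidates are most numerous, than near either extreme.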


Since a candidate’s performance on each test 
he takes is referred to that of the “standard 
group of College Board candidates,” his scores 
on the different tests may be compared with 
each other and his special strengths and weaknesses 
identified. The test user, moreover, in 
comparing candidates who took different tests, 
need not be concerned about the fact that some 
tests seem to attract students of higher general 
ability than others. Nor does the test user need 
to take special account of the fact that the general 
ability of candidates varies from one series 
to another, since the candidates at each series are 
compared with the “standard group of candidates.” 
Moreover, a college may judge whether 
the ability level of its applicants (and of its students) 
is rising or falling from year to year. 


*For the information of the more technically trained reader, 
an account of the process by which the candidate’s raw score is 
translated into a standard rating appears at the end of this article 
(p. 207). 

Some teachers who use a percentage scale in 
grading their own students may wonder why 
College Board test scores are not reported on a 
percentage basis. Probably the chief obstacle to 
the use of such a scale in a broad program is that 
the concept of perfect performance is ambigu- 
ous. To secure a definition of perfect perform- 
ance which is consistent from test to test and 
from year to year seems nearly impossible. By 
contrast, the average performance of a defined 
candidate group is a reference point which com- 
bines flexibility with definiteness. For example, 
suppose that the test given in a particular sub- 
ject is a bit more difficult than usual. The indi- 
vidual candidates will make lower original 
scores, but the average original score will also 
be correspondingly lower. The standard ratings 
are so computed that candidates who take a 
more difficult test are not penalized. Use of the 
average performance of a standard group as a 
reference point is thus admirably fitted to a 
complex, continuing test program. 

Certain idiosyncrasies of the reporting system 
may be mentioned in passing. Scores which 
range from 200 to 800 are not likely to be con- 
fused with intelligence quotients—even for the 
most superior group of candidates—nor can 
they be mistaken for percentage scores. Users of 
the scores, however, have been cautioned not to 
suppose that the use of three digits implies 


[ 202 ] 













greater precision than the test actually has. A 
further rule sets 800 and 200 as the two extremes 
of the scale because differences in scores beyond 
these limits are not sufficiently precise to war- 
rant reporting the exact numbers. Thus, no mat- 
ter how much more successful a candidate may 
be than all the others, his score cannot be higher 
than 800 and no candidate has the misfortune to 
score below 200. 

Several ETS staff members have made special 
studies, reported in the College Board Review, 
concerning the use of additional information 
about candidates in interpreting standard rat- 
ings. William Turnbull (in the Spring, 1947 is- 
sue) reported that no special allowance need be 
made for various arrangements of topics in sec- 
ondary school physics in interpreting scores on 
the Physics Achievement Test in the April series. 
Ledyard Tucker (in the Spring, 1948 issue) de- 
scribed the scaling plan established for foreign 
language Achievement Tests, designed so that 
“each year of training beyond two will, on the 
average, yield a 60-point rise in score.” Thus, 
a user can make allowance for the amount of 
training a candidate has had. Richard Pearson 
(in the Fall, 1948 issue) reported that a candidate 
who repeated the SAT-verbal would ordinarily 
gain 20 points or less attributable to 
practice in taking the test. On the other hand, 
he concluded that because of growth in verbal 
ability, a candidate who took this test in June 
of his junior year probably earned a score about 
35 points below the one he would earn in April 
of his senior year. 

These studies make it clear that the system of 
standard ratings does not solve all problems in 
score interpretation. A number of considerations 
must still be kept in mind by the admissions 
officer in addition to the numerical values of the 
scores when he evaluates the performance of a 
specific candidate on a particular test. For ex- 
ample, the time at which a candidate takes a 
test is related to his age and maturity, the 
amount of preparation which he will have com- 
pleted, and the recency with which he will have 
received the various parts of his preparation. 


Sex differences may be highly significant with 
some tests, such as the Spatial Relations Test. 
It may be of interest to note that there was a 
difference of 96 points between the average score 
for boys and the average for girls on one of the 
recent spatial tests. Thus, the system of stand- 
ard ratings, however useful, must be regarded 
as an aid to, rather than as a substitute for, 
individual judgment of the meaning of a set of 
test results. 


HOW EXACT ARE 
COLLEGE BOARD STANDARD RATINGS? 


Numerous studies of the accuracy with which 
College Board tests are scored have shown that 
this process meets very stringent standards. The 
use of objective tests in by far the greater part 
of the program minimizes the possibility of er- 
rors of judgment. Test experts, however, recog- 
nize and make allowance for the fact that no 
test score is a perfect index of the abilities which 
the test measures. Elimination of this source of 
error is (by definition) impossible, and its re- 
duction beyond a certain point generally re- 
quires an unreasonable amount of effort and 
expense. Fortunately, a rather good idea of the 
amount of such error which is likely to appear 
in a set of test scores can be obtained from the 
standard error of measurement. Information on 
this point is very useful in judging the degree of 
confidence which may be placed in test scores, 
and therefore in judging the value of the test 
itself. 

The logic underlying the standard error of 
measurement is quite straightforward. Suppose 
that a large number of different one-hour tests 
of the SAT-verbal type were prepared, and that 
each of a group of students took all of these one- 
hour tests under suitable conditions. It is clear 
that the average score which a student earned 
on these tests would be better as an index of his 
“true” ability than would any single test. Fol- 
lowing up this line of reasoning, we might define 
a student’s true score as his average perform- 
ance on an unlimited number of tests similar to 


[ 203 ] 





the test which is being considered. The extent 
to which the actual scores on the test would 


differ from such true scores is indicated by the 
standard error of measurement. The size of the 
differences between actual and true scores will 
vary from one student to another. It turns out 
that the difference between the score actually 
earned and the true score is less than the stand- 
ard error of measurement for about 2 students 
in every 3 and less than twice the standard error 
of measurement for about 19 students in 20. 

For example, if we imagine a student with a 
true score of 550 on a test which has a standard 
error of measurement of 25 points, the chances 
are about 2 in 3 that his score will be between 
525 and 575 on the test. The chances are 19 in 20 
that he will make a score between 500 and 600. 
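
The arithmetic of this example can be sketched as follows. The ranges come straight from the text; the chances attached to them are computed on the assumption, not stated in the article, that errors of measurement are normally distributed. Both function names are illustrative.

```python
import math

def score_band(true_score, sem, multiples=1):
    """Range lying within `multiples` standard errors of a true score."""
    return (true_score - multiples * sem, true_score + multiples * sem)

def chance_within(multiples):
    """Probability that a normally distributed error falls within
    `multiples` standard errors of zero (normality is an assumption)."""
    return math.erf(multiples / math.sqrt(2.0))

for k in (1, 2):
    low, high = score_band(550, 25, k)
    print(low, high, round(chance_within(k), 3))
```

One standard error on either side covers roughly two-thirds of the cases, and two standard errors roughly nineteen in twenty, in line with the figures quoted above.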

Standard errors of measurement are reported 
for each of the objective College Board tests in 
the College Board Annual Report. Systematic 
attention is devoted to keeping the standard er- 
rors of measurement low; and this effort has 
been well repaid. In March, 1950, the standard 
error of measurement for SAT-verbal was 24 
points; for other tests, it varied from 18 points 
to 39 points. It may properly be said that a stu- 
dent’s standard rating establishes the general 
level of performance of the student in the abili- 
ties measured by the test. Intelligent caution 
must be exercised, however, in interpreting the 
scores to avoid exaggerating the importance of 
small differences in the test performance of in- 
dividual students. 


PREDICTIVE VALUE OF RATINGS 


Many users of College Board tests are much 
more interested in what a student will do than 
in what he has done. Indeed, for such purposes 
as selection and scholarship awards, the meas- 
urement of a student’s accomplishment is useful 
chiefly because experience has shown that past 
performance along appropriate lines is an excel- 
lent indication of his future promise. Thus, it is 
often more important to be able to say: “This 
man is a promising engineering student” than 


to be able to say: “This man has an excellent 
background in mathematics.” Because it is 
harder to predict future performance than to 
describe present standing in a particular field, 
careful test users make such predictions only 
when they have sound evidence that the test is 
actually related to later success. Hunches, per- 
sonal opinions, and even common sense do not 
provide an adequate basis for predicting success 
from test scores. 

The effectiveness of a test for a particular 
purpose is termed its validity for that purpose. 
The limitation of validity to a specific purpose is 
no mere quibble; the SAT-verbal score may be 
highly valid for predicting success in English 
but quite lacking in validity for predicting suc- 
cess in engineering drawing. Careful statistical 
study is generally needed before the suitability 
of a test for a particular purpose can properly 
be judged. For example, a recent study showed 
that the SAT-mathematical score was more ef- 
fective than the verbal score in predicting first- 
term engineering grades in four College Board 
member colleges; in the fifth, however, the SAT- 
verbal score was superior. 


STUDYING TEST VALIDITY 


In the last analysis, the value of a test de- 
pends on the aid that it gives in understanding 
individual boys and girls. We may well begin 
our consideration of test validity by consider- 
ing 141 students who entered an Ohio college 
in the fall of 1948. As it happened, 24 of these 
students made relatively poor college records 
during their first semester of college work. How 
many of these students could have been identi- 
fied as being in the lowest 24 students on high 
school rank? How many could have been 
spotted if we used high school rank and College 
Board test scores to pick the 24 least promising 
students? 

Table 2 shows what happens when we split 
the group into “good” and “poor” students in 
this way. In all, the high school rank would 
identify 7 of the 24 poor students while the com- 


[ 204 ] 














bined predictors would locate 11. We notice also 
that there are 30 students among the group 
where the two predicting methods differ; the 
combined predictors are “right” for 19 of these 
students, while high school rank is “right” for 
11 of these individuals. 


TABLE 2 


How 141 students were classified as “good” (top five-sixths of 
class) and “poor” (bottom one-sixth of class) on the basis of (1) 
high school rank, (2) high school rank combined with test score, 
and (3) actual first-semester grades 


Prediction by     Prediction by 
High School       Rank and Test     Actual Grades     Number of 
Rank Alone        Combined          Earned            Students 

Good              Good              Good                 91 
Poor              Good              Good                 13 
Good              Poor              Good                  9 
Poor              Poor              Good                  4 
Good              Good              Poor                 11 
Poor              Good              Poor                  2 
Good              Poor              Poor                  6 
Poor              Poor              Poor                  5 
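

The counts quoted in the text can be checked directly from the rows of Table 2. The sketch below simply tallies the table; the variable and function names are illustrative.

```python
# Rows of Table 2: (rank alone, rank and test combined, actual grades, students)
TABLE_2 = [
    ("Good", "Good", "Good", 91),
    ("Poor", "Good", "Good", 13),
    ("Good", "Poor", "Good", 9),
    ("Poor", "Poor", "Good", 4),
    ("Good", "Good", "Poor", 11),
    ("Poor", "Good", "Poor", 2),
    ("Good", "Poor", "Poor", 6),
    ("Poor", "Poor", "Poor", 5),
]

def tally(rows):
    """Summarize how often each method of prediction agrees with the outcome."""
    differ = [row for row in rows if row[0] != row[1]]
    return {
        "students": sum(n for _, _, _, n in rows),
        "actually_poor": sum(n for _, _, actual, n in rows if actual == "Poor"),
        "poor_found_by_rank": sum(
            n for rank, _, actual, n in rows if rank == actual == "Poor"),
        "poor_found_by_combination": sum(
            n for _, comb, actual, n in rows if comb == actual == "Poor"),
        "methods_differ": sum(n for _, _, _, n in differ),
        "combination_right": sum(
            n for _, comb, actual, n in differ if comb == actual),
        "rank_right": sum(n for rank, _, actual, n in differ if rank == actual),
    }

for name, count in tally(TABLE_2).items():
    print(name, count)
```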


When decisions as important as admission to 
college are at stake, it is essential to secure the 
best possible basis of prediction to aid the ad- 
missions officer or committee in deciding on ac- 
ceptance or rejection of individual candidates. 
Of course, the evidence considered in this ex- 
ample would be supplemented by additional 
data in reaching such decisions in practice. 

Among the numerous questions which are 
raised in determining whether test scores are 
appropriate for a particular use, three may be 
judged to be especially important. First, does 
the test contribute anything worthwhile to the 
prediction of success? Second, which of two tests 
is the better predictor of a particular kind of 
success? Third, how may scores on tests and 
other predictors be combined to give the best 
over-all estimate of success? 

The numerous investigators of these signifi- 
cant problems have brought much ingenuity 
and statistical expertness to bear on their solu- 
tion. Among the many methods applied, the 
correlational method is undoubtedly the most 
popular. This method yields a definite numeri- 
cal index, the validity coefficient, of the close- 
ness of relationship between test scores and later 


success. Errors of prediction, of course, lower 
the validity coefficient; the more serious the 
error, the more the reduction of the coefficient. 
Thus, if validation is “testing the test,” the va- 
lidity coefficient may be regarded as the “score” 
which the test earns. 
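
As a minimal sketch, and assuming the validity coefficient is the ordinary product-moment correlation between test scores and a later measure of success, the index could be computed as below. The scores and grade averages are invented for illustration and come from no actual study.

```python
import math

def validity_coefficient(test_scores, later_grades):
    """Product-moment correlation between two equal-length lists of numbers."""
    n = len(test_scores)
    if n != len(later_grades) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(test_scores) / n
    mean_y = sum(later_grades) / n
    # Sums of squares and cross-products about the two means.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, later_grades))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in later_grades)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: six students' test scores and first-semester averages.
scores = [420, 480, 510, 560, 610, 650]
grades = [1.8, 2.4, 2.1, 2.9, 3.0, 3.4]
print(round(validity_coefficient(scores, grades), 2))
```

A group of six is far too small for a real study; it serves here only to show the calculation.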

The interpretation of validity coefficients in- 
volves many considerations. Three matters of 
primary concern should be carefully weighed be- 
fore drawing any conclusions. First, if the num- 
ber of students included in the research is small, 
and especially if fewer than 100 students are in- 
cluded, the results must be viewed with distinct 
caution. Second, it is important to know whether 
or not the students were selected for admission 
on the basis of the abilities measured by the test. 
If so, the validity coefficient obtained is an un- 
derestimate of the value of the test for selecting 
students. A part of the value of the test has been 
“used up” in the selection process. Most of the 
really poor prospects, whose failure could have 
been predicted with confidence, were never ad- 
mitted to the college and necessarily were 
omitted from the study. Available methods for 
estimating the effect of competitive admissions 
on test validity indicate that such effects are 
substantial in many College Board member col- 
leges. Third, the dependability of the measure 
of success used in the study should be carefully 
considered; grades in some colleges or in some 
subjects are more dependable than those in 
others. These difficulties suggest that it is quite 
risky to judge that one test is better than an- 
other without a carefully planned study of the 
problem. 

For practical purposes, the value of particular 
tests as predictors of academic grades for highly 
selected groups of college freshmen may be 
judged according to the following rough scale: 

Validity Coefficient     Interpretation 

.60 or above             Excellent (but seldom obtained) 
.45 to .59               Satisfactory 
.25 to .44               Possibly useful in a predictive team 
.00 to .24               Of doubtful value, or useless 
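

The rough scale lends itself to a small lookup, sketched below. Two choices are assumptions of the sketch and are not in the article: a coefficient falling between two printed bands (say .595) is assigned to the lower band, and negative coefficients are placed in the lowest band.

```python
def interpret_validity(coefficient):
    """Place a validity coefficient on the article's rough scale."""
    if coefficient >= 0.60:
        return "Excellent (but seldom obtained)"
    if coefficient >= 0.45:
        return "Satisfactory"
    if coefficient >= 0.25:
        return "Possibly useful in a predictive team"
    # Assumption: anything below .25, including negative values, lands here.
    return "Of doubtful value, or useless"

for value in (0.62, 0.50, 0.30, 0.10):
    print(value, interpret_validity(value))
```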


It should be added that validity coefficients of 
.60 or higher are excellent for existing tests, but 
they are far below what one might hope for 


[ 205 ] 





under more utopian circumstances. To the pes- 
simist, a correlation of .60 means that the quali- 
ties which make for excellence in test score and 
the qualities which produce academic success 
are by no means identical. Moreover, some con- 
ditions which may affect a student’s grade (e.g., 
whether or not he falls violently in love during 
the semester, the agreement—or the lack of it— 
between his objectives and his teacher’s objec- 
tives, the studiousness of his roommate) may be 
inherently unpredictable. To the optimist, a cor- 
relation of .60 means that the use of the test in 
selection will result in a group of students which 
is decidedly superior to the applicant group. 
Figure 1 illustrates the relationship between test 
score and success for a validity coefficient of .60. 
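
A relationship of this kind can be worked out numerically. The sketch below is one way to do so and is not necessarily how the figure was prepared. It assumes that test scores and grades follow a bivariate normal distribution and that "average or better" means at or above the mean grade. The function name and the crude integration rule are illustrative.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def chance_of_average_or_better(lower_pct, upper_pct, validity, steps=4000):
    """Chances in 100 of an at-or-above-average grade for students whose
    test standing lies between two percentiles of the applicant group.

    Assumptions: bivariate normal scores and grades; "average" is the mean.
    """
    # Convert percentile limits to standard-score limits; the open-ended
    # bands are cut off at 8 standard deviations.
    lo = STD.inv_cdf(lower_pct / 100) if lower_pct > 0 else -8.0
    hi = STD.inv_cdf(upper_pct / 100) if upper_pct < 100 else 8.0
    slope = validity / (1 - validity ** 2) ** 0.5
    width = (hi - lo) / steps
    total = weight = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width          # midpoint of a narrow slice
        density = STD.pdf(x)
        total += density * STD.cdf(slope * x)  # chance of success at score x
        weight += density
    return 100 * total / weight

BANDS = [(96, 100), (84, 96), (50, 84), (16, 50), (4, 16), (0, 4)]
for lower, upper in BANDS:
    print(lower, upper, round(chance_of_average_or_better(lower, upper, 0.60)))
```

Calling the function with a validity of .40 instead of .60 shows how much less sharply a weaker test separates the bands.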


FIGURE 1. Chances in 100 that students who score at various levels of excellence on a test 
will earn average or better college grades when the test has a validity of .60. 
(Bar chart not reproduced. Rows: relative standing on test and per cent of students, in six 
bands: highest 4%; below highest 4%, above lowest 84%; below highest 16%, above lowest 50%; 
below highest 50%, above lowest 16%; below highest 84%, above lowest 4%; lowest 4%. 
Horizontal scale: chances in 100, from 0 to 100.) 


COMBINING PREDICTORS 


One of the merits of correlational procedures 
is that they make it easy to determine how various 
predictive measures, e.g., SAT-verbal, SAT-mathematical, 
and some measure of high school 
success, may be combined into an over-all estimate 
of success in college. These over-all estimates 
are often decidedly better than those 
obtained from a single predictor. Figure 2 is a 
graphical device for this purpose, based on data 
supplied by a College Board member college. 

For any given combination of test scores on 
the morning program and high school rank in 
class, a predicted grade is obtained. This is the 
grade which a candidate is most likely to make. 
Table 3 carries the procedure one step farther by 
estimating the chances in 100 that a student 
with any given predicted grade (from Figure 2) 
will earn any specified actual grade. 


TABLE 3 


Chances in 100 that Students with Various Predicted Grades 
Will Equal or Excel Various First-Semester Average Grades 
(Prediction Based on SAT-V, SAT-M, and High School Record) 
(Validity of Best-Weighted Combination = .75) 

Predicted     Chances in 100 of Equalling or Excelling a Grade of: 
Grade           F       D       C       B       A 

A              99+     99+     99+     96      50 
B              99+     99+     96      50       4 
C              99+     96      50       4       0+ 
D              96      50       4       0+      0+ 
F              50       4       0+      0+      0+ 

The predictors obtained from Figure 2 and Table 3 should, of 
course, be used in conjunction with other evidence of promise in 
evaluating any particular candidate. 


The higher the validity coefficient, the higher 
is the proportion of good students in the high-scoring 
group, and the lower the proportion of 
good students in the low-scoring group. Even 
with a validity coefficient as low as .40, an excellent 
group may be obtained if only the few 
students with the highest scores are chosen. 
With higher validity coefficients, however, a 
larger proportion of applicants can be chosen 
without any sacrifice in academic standards. For 
this reason, among others, the size of the validity 
coefficient is of much importance in interpreting 
test scores. 


[ 206 ] 


FIGURE 2. Abac for estimating first-semester average grade, applicable only to the college which supplied data. (The 
Board publishes a booklet explaining conversion of secondary school rank in class to the standard score scale.) 
(Chart not reproduced. Vertical scale: rank in class (standard score), 300 to 800. Horizontal scale: 
2 (SAT verbal score) + (SAT mathematical score), 600 to 2400.) 


SUMMARY 


Wise interpretation of test scores will never 
be a purely mechanical process. Nevertheless, 
much is being done to help test users to reach 
correct decisions with greater ease. The College 
Board standard ratings enable test users to 
apply their past experience with College Board 
tests to the results of a new series by making 
scores comparable from test to test and from 
year to year. Standard errors of measurement 
remind the test user to doubt the significance of 
small differences in test scores. Validity studies 
show which tests are relevant to specific predic- 
tion problems and offer ways of combining sev- 
eral predictors efficiently in a team. The effort 


devoted to making test scores meaningful, then, 
contributes directly by reducing errors of inter- 
pretation and indirectly by saving time for the 
test user. If the time thus freed is spent in weigh- 
ing subtle, intangible elements in the promise of 
individual boys and girls, the indirect contribu- 
tion of aids to score interpretation is most signi- 
ficant. 


Translation of Raw Scores 
into Standard Ratings 


Mechanically, the process of translating raw scores into 
standard ratings is quite simple. The student’s answer sheet 
is scored, the scoring checked, and when necessary re- 
checked. This produces the original or “raw” score. Using a 
list giving for each possible raw score on each College 
Board test the corresponding standard rating, it is a 
straightforward process, using International Business Ma- 
chines, to substitute the appropriate standard rating for 
the original raw score. The standard rating is now ready to 
be reported to the test user. 


[ 207 ] 





This mechanical explanation, unfortunately, glosses over 
the critical step in the sequence of events. The crucial 
question is: What system is used in assigning an appro- 
priate standard rating to each raw score on a particular 
College Board Test? This question will require a somewhat 
lengthier answer. 

We may begin by quoting from the laconic “Interpreta- 
tion of Ratings” which appears on the back of the score 
report form: 


“All examinations are reported on a standard scale on 
which the rating of the average candidate of the standard 
group is 500. Standard ratings range from 200 to 800; 
two-thirds of the candidates receive ratings between 400 
and 600 (i.e., the standard deviation is 100).” 


The goal which is sought, then, in translating raw scores 
is to produce a new set of scores (the standard ratings) 
which will permit comparison of each candidate with a 
standard group of candidates. To see how this goal is 
reached, we need to consider briefly the terms “standard 
group” and “standard deviation.” 

The standard group plays a key role in setting the 
scheme of standard ratings for each test. Experienced 
teachers write test questions which will be fair measures 
of the ability of the standard group. Results of special 
pretests of these questions are evaluated in terms of how 
well the standard group would perform on these questions. 
Finally, careful statistical adjustments are brought to bear, 
after the tests have been given, to make allowance for 
differences between the group which actually took a given 
test and the standard group. It is hardly necessary to add 
that the standard group is fictitious as a group, but indis- 
pensable as a guiding concept in test development and in 
defining the scale of standard ratings for a test. Its exist- 
ence is embodied in a huge mass of statistical information 
about questions which have been used in College Board 
tests in the past ten years and in the conceptions of candi- 
date training and ability held by the persons who create the 
tests. 

A rather technical statistical tool, the standard deviation, 
also plays an important part in the process of assigning 
standard rating values to raw scores. Essentially, the 
standard deviation is a mathematically convenient index 
number for representing the amount of spread of scores 
above and below the average score. If the scores bunch 
closely around the average, the standard deviation will be 
small. If the scores spread widely in both directions, the 
standard deviation will be large. 

The standard deviation has one quality which is very 
convenient in test work: Suppose that we multiply or di- 
vide every student’s raw score on a given test by the same 
number; the standard deviation of the resulting scores will 
be the original standard deviation multiplied or divided by 
that number. This makes it easy to stretch out a bunched- 
up set of scores or to draw together an excessively spread- 


out set of scores. In particular, if we divide every raw score 
by the standard deviation of all raw scores and multiply the 
result by 100, the standard deviation of the resulting set 
of scores will also be 100. 
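
This property is easy to verify. In the sketch below the raw scores are invented for illustration, and the population form of the standard deviation is used.

```python
from statistics import pstdev

raw_scores = [31, 42, 47, 55, 58, 63, 70, 74]  # hypothetical raw scores

spread = pstdev(raw_scores)

# Doubling every score doubles the standard deviation.
doubled = [2 * score for score in raw_scores]
print(pstdev(doubled) / spread)

# Dividing by the standard deviation and multiplying by 100 gives a
# set of scores whose standard deviation is 100.
rescaled = [100 * score / spread for score in raw_scores]
print(round(pstdev(rescaled), 6))
```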

We are now ready to consider how the standard scale 
for any College Board test is developed. Using evidence 
from a number of sources, steps are taken to compare per- 
formance on, say, the Social Studies Test given in the 
March, 1951 series with other tests in the same series and 
with Social Studies Tests taken in earlier test series. This in 
turn makes it possible to calculate what the mean and 
standard deviation of raw scores would be if the standard 
group had taken the test. Once the mean and standard 
deviation of raw scores for the standard group are known, it 
is only necessary to subtract this average raw score from 
each possible raw score, divide the result by the standard 
deviation of raw scores, multiply by 100, and finally add 
500 to each result. Thus, the standard rating appropriate to 
each raw score on the test is settled. Of course, this process 
is carried through once and for all for any one test, and is 
much simpler to do algebraically than to describe verbally. 
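
The algebra referred to here can be sketched in a few lines. The mean of 70 and standard deviation of 20 are hypothetical values chosen for illustration. Rounding to a whole number is an assumption of the sketch. The limits of 200 and 800 follow the rule described earlier in the article.

```python
def standard_rating(raw_score, group_mean, group_sd):
    """Convert a raw score to a standard rating on the 200-800 scale.

    `group_mean` and `group_sd` are the raw-score mean and standard
    deviation estimated for the standard group on this particular test.
    """
    rating = 500 + 100 * (raw_score - group_mean) / group_sd
    # Ratings are held within the reported limits of the scale.
    return max(200, min(800, round(rating)))

# Hypothetical test: the standard group would average 70 raw points,
# with a standard deviation of 20.
for raw in (10, 50, 70, 90, 140):
    print(raw, standard_rating(raw, 70, 20))

# A harder form of the test (hypothetical mean of 60) gives the same
# rating to a candidate who stands one standard deviation above the mean.
print(standard_rating(80, 60, 20))
```

Because the mean and standard deviation are fixed once for each test, the whole conversion list for that test can be prepared in advance from this one rule.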


[Figure: paired scales, with original scores on the left and the 
corresponding standard ratings, from 200 to 800, on the right.] 

Relationship between original scores on one College Board 
test and the standard ratings reported to test users. 


The relationship between original “raw scores” and 
standard ratings may be compared to the relationship be- 
tween the Centigrade and Fahrenheit scales of tempera- 
ture. Each score on one scale is equivalent to some score 
on the other and the translation from one to the other can 
be made in either direction. There is, however, a new and 
different relationship for each new test since it is rare for 
the raw score characteristics of two tests to be alike, 
whereas the standard scale remains constant. The original 
scores are related primarily to the characteristics of the 
test. The standard ratings are related primarily to the 
ability of the group of persons who take the test. The ac- 
companying figure illustrates the relationship between the 
original and converted scores for one College Board test. 


[ 208 ] 














A suggestion to colleges 


We are glad to publish the following excerpt 
from a letter received by the Director from 
Miss Alice Gouled, Guidance Director of the 
Weehawken, New Jersey, High School, in 
the hope that its publication will help to 
further the practice Miss Gouled advocates: 
Can the College Entrance Examination Board use its 
influence to recommend that all colleges adopt the practice 
of certain colleges like Columbia, Rutgers, and Cornell in 
notifying the high schools concerning the disposition made 
of the student’s application? It is a common complaint of 
guidance officers that students “forget” to report informa- 
tion about whether or not they have been accepted or 
rejected at the college to which they made application. 
Since follow-up is so important a part of adequate 
guidance services, notification by the colleges either in 


the form of a simple postcard or a copy of the letter sent 
to the student by the college is indeed very helpful. 


Regional conferences held in 
Evanston and Roanoke 


In the interests of school and college cooper- 
ation and of the understanding of the use of 
tests, the College Board this spring spon- 
sored two regional conferences on problems 
of college admission. The two conferences, 
held in Evanston, Illinois, on Saturday, 
March 3, and in Roanoke, Virginia, on Sat- 
urday, March 17, were attended by more 
than 200 representatives of member and 
nonmember colleges and of public and inde- 
pendent schools. In addition, the Board’s 
officers visited colleges and schools in the 
West and Northwest. 

The regional conferences and visits were 
substituted for the Board’s regular spring 
meeting in New York. 

No regional meetings were held in New 
England or the Middle Atlantic States, 
where schools and colleges are more familiar 
with the Board and its services. 


Special committees make 
progress reports 


Three subcommittees of the Committee on 
Examinations presented progress reports 
to the Executive Committee at its April 
meeting. 

The Subcommittee on Achievement Tests 
made 17 specific recommendations which 
were referred to the Board’s administrative 
officers for study with instructions to report 
in October. The Board’s officers were also 
instructed to poll teachers’ opinions con- 
cerning the committee’s recommendations. 

The Subcommittee on Science Testing 
will prepare an experimental form of a new 
science test and will make recommendations 
concerning the revision of the Board’s pres- 
ent science tests. 

A report of progress by the Subcommittee 
on Testing in English is presented on p. 210. 


Fees reduced for physically 
handicapped candidates 


Henceforth the Board will undertake to re- 
lieve physically handicapped candidates of 
the costs of special supervisors. 

In the interests of proper supervision, the 
Board has always required that its super- 
visors should be selected by the Board, paid 
by the Board, and responsible to it. In the 
past, handicapped candidates, except in 
cases where financial hardship would have 
resulted, have been asked to bear the cost of 
special supervision. Now the Board will bear 
this cost in all cases, whether or not financial 
hardship is indicated. Candidates will, of 
course, be required to pay the regular fees 
and to provide their own amanuenses. 


[ 209 ] 





Northwestern accepts 


May 21 date 


Northwestern University has notified the 
Secretary of the Board that it wishes to be 
added to the list of 70 colleges subscribing to 
the May 21 uniform acceptance date agree- 
ment. The list of colleges and the text of the 
agreement were published in the February 
Review. 


Board plans publication of 
two new manuals 


Plans for the publication of two new man- 
uals designed to provide additional informa- 
tion about the nature and interpretation of 
the Board’s tests were approved by the 
Executive Committee at its meeting on 
April 4. One manual, on the nature of the 
examinations, will include a considerable 
number of test items and will be issued at 
intervals of three or more years. A second 
manual, on the interpretation of the tests, 
will be issued every two years. Each issue of 
the second manual will provide interpreta- 
tive data on the Board’s tests in one or two 
of the subject matter fields. 

The manual on the nature of the tests will 
be similar to an expanded Bulletin of Infor- 
mation. It will give consideration to the 
underlying principles upon which each test 
is based and will illustrate, by means of specific 
items, the manner in which the principles 
have been applied. Further, the difficulty 
of each test will be shown in terms of 
the median and quartiles for a standard 
Board population. 

The first part of the manual on the inter- 
pretation of the tests will be designed for the 
reader with little training in the technical 





aspects of testing. It will discuss the form of 
the score distribution, the score scale, the 
reliability of the scores, and evidences of 
validity. Percentile norms for various group- 
ings of candidates, together with a discus- 
sion of their possible use, will be given. The 
second part of this manual will include much 
of the material now contained in the Annual 
Report, supplemented by additional statis- 
tical tables. 

The two manuals are to be similar in for- 
mat and will be considered companion vol- 
umes, even though they may not be issued 
together. They will be related to each other 
by cross-references. 

The manuals should prove particularly 
useful to admissions officers, secondary 
school administrators, guidance personnel, 
and college deans. They will provide data 
which the individual test consumer is often 
unable to secure and which are frequently 
needed to make the Board’s test results 
thoroughly useful. 


Progress of General 
Composition Test experiment 


A “pilot study” of the new General Com- 
position Test is now under way. Four experi- 
mental forms of a three-hour essay test have 
been prepared by Dr. Earle G. Eley of the 
University of Chicago in cooperation with 
the Educational Testing Service. The tests 
have been distributed to a representative 
sample of public and independent schools 
for administration to their students this 
spring. During the month of June a reading 
conference will be held. Twenty readers have 
been invited. They will undergo a period of 
training at the start of the conference and 
will then read the papers of the 800 students 
involved in the experiment. A detailed
study of test validity and reliability and of
reader reliability will be conducted. 

Early in May the Board will also distrib- 
ute single copies of the tests together with 
an explanation of the experiment to member 
colleges and to 500 schools which last year 
sent 50 or more candidates to the Board’s 
tests. Other colleges and schools may obtain 
copies by writing to the Secretary of the 
Board. Colleges and schools will be free to 
reproduce the tests for their own use. There 
will be no charge. 

No definite decision will be made as to the 
place of the test in the Board’s program un- 
til the results of the pilot study have been 
analyzed, but tentative plans have been 
made to offer the test on the afternoon of the 
regular series in May 1952. The May series 
is used by more than 15,000 candidates of 
whom the majority are preliminary (junior) 
candidates. Both final and preliminary can- 
didates would have the option in the after- 
noon of taking either the new test or the 
present Achievement Tests. 


More candidates this year 


The Board, at its December, January, and 
March series, examined a larger total num- 
ber of candidates this year than it did last 
year. Frank H. Bowles, Director of the 
Board, explained that this is good evidence 
that guidance counselors have been advising 
students to complete their requirements for 
college before entering military service. 
“Whether it is also evidence that counselors 
are advising students to enter college before 
induction,” Mr. Bowles said, “is difficult to 
say, but informal reports from Board col-
leges indicate that they are not expecting a
startling drop in male freshman enrollment 
next fall.” 


Mystery of Admissions (continued) 


arate. We are now a college admissions commit- 
tee in the process of selecting a class. The first 
step is elimination: by means of the predictions, 
of those headed for failure; by means of the re- 
ports, of those who appear personally undesir- 
able. In this step we shall undoubtedly make 
some errors—not, one hopes, of significant 
scope. Our real task remains. There are three 
groups we should be sure to admit: those few 
who seem to be tops on both counts; those with 
high predictions who are satisfactory as persons; 
and those who are first-rate persons and at least 
satisfactory as students. As we move down our 
academic scale, we shall sooner or later reach 
areas where the candidates look very much 
alike: still competent students, but by no means 
Phi Beta Kappas; still good lads, but, as yet, 
not distinguished. In a private college, in these 
areas, we should probably give preference to 
sons of alumni as, in a state university, we 
should probably give preference to our own 
state’s citizens. And we might also give prefer- 
ence to those hailing from regions whence a 
larger representation is desirable. I grant at once 
that the chances of paternity and of residence 
have little connection with intrinsic merit. But 
we have reached groups where candidates can 
hardly be distinguished, one from another, by 
merit, academic or otherwise; some other cri- 
teria are necessary, for decisions have to be 
made. Here is where we shall make most of our 
errors. Our only comfort is that such errors will 
be inevitable, until some methods of appraising 
and comparing the personal qualities of candi- 
dates have been devised which at least approxi- 
mate the accuracy with which the academic ap- 
praisal can now be made. Such methods may be 
discovered. About them, however, my feelings 
are a little like those of Tennyson’s Ulysses: 


“All experience is an arch wherethrough
Gleams that untravelled world whose margin fades
Forever and forever when I move.”










Board Publications 


Annual Report of the Director, 1949. De- 
scription of Board activities, lists of mem- 
bers, examiners, readers. Contains a new 
section, “Data for Interpreting the Tests.” 
84 pages. $.50. 


Bulletin of Information and Sample Tests. 
Advice to candidates and parents, dates of 
examinations, registration and fees, de- 
scription of tests, sample questions. 56
pages. Free.


College Board Review. News and research 
of the College Entrance Examination 
Board. Subscription: one year, $.50; two 
years, $1. Hard-covered, looseleaf binders 
for the Review stamped in gold leaf are 
available at cost, $2. 


Handbook, 1949, and Supplement, 1950. 
Terms of admission to the member col- 
leges. 296 pages (Handbook). $1.50. (Sup-
plement out of print.)


The College Board, Its First Fifty Years. 
By Claude M. Fuess. “The full story of the 
College Entrance Examination Board’s 
contribution to twentieth-century educa- 
tion in America.” Published by Columbia 
University Press, New York, 1950. 224
pages. $2.75.


School Lists. Mimeographed lists of public
and independent secondary schools send-
ing an appreciable number of candidates
to the Board’s examinations. 9 pages. Free.


Order from the Secretary, College En-
trance Examination Board, 425 West 117th
Street, New York 27, N. Y.














Dates, Tests, Fees: 1951 


EXAMINATION DATES 


May 19, 1951
August 15, 1951
December 1, 1951


EXAMINATION PROGRAMS* 
Morning Program 


Scholastic Aptitude Test 
(Verbal Section) 
(Mathematical Section) 


Afternoon Program 


Achievement Tests 
(a maximum of three afternoon tests) 


English Composition      Chemistry
Social Studies           Physics
French Reading           Intermediate Mathematics
German Reading           Advanced Mathematics
Latin Reading
Spanish Reading
Biology


Aptitude Tests 


Pre-Engineering Science Comprehension 
Spatial Relations 


EXAMINATION FEES 


Morning Program and Afternoon Program
Morning Program only
Afternoon Program only





* The College Transfer Test, for students transferring 
from one college to another, is offered on the same dates 
and at the same centers as the College Entrance Tests. It 
is administered in the morning. The fee is $6. Bulletins 
of Information and application blanks for the College 
Transfer Test will be sent upon request. Address the Col- 
lege Entrance Examination Board, Box 592, Princeton, 
N. J., or Box 9896, Los Feliz Station, Los Angeles 27, Cal. 











